1.1 This paper describes a recent refinement of the
machine-learning process employed by Samuel (1) in connection
with his development of a checker playing program. Samuel's
checker player operates in much the same way a human player
does: by looking ahead, and by making a qualitative
judgement of the strength of the board positions it
encounters. A machine-learning process is applied to the
development of an accurate procedure for making this
strength evaluation of board positions. Before discussing
my modifications to Samuel's learning process, I should like
to describe briefly Samuel's strength evaluation procedure
and the associated learning process.

1.2 Samuel's playing program assigns a strength value to
a board position on the basis of the values of a fixed set
of 31 parameters. An example of such a parameter is the
degree to which the move leading to the position in question
contributes to control of the center of the board. The
strength value for a board position is simply a weighted sum
of the parameter values. Mathematically, the strength value
is a linear function of the parameter values:

$$S(X_1, X_2, \ldots, X_{31}) = \sum_{i=1}^{31} A_i X_i$$

where the A_i's are the weighting factors.
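
The weighted sum can be sketched in a few lines of Python. This is only an illustration of the formula, not Samuel's program: the name linear_strength and the three-parameter values below are hypothetical.

```python
from typing import Sequence


def linear_strength(weights: Sequence[float], params: Sequence[float]) -> float:
    """Weighted sum of parameter values: S = sum over i of A_i * X_i."""
    if len(weights) != len(params):
        raise ValueError("weights and params must have the same length")
    return sum(a * x for a, x in zip(weights, params))


# Hypothetical three-parameter illustration; the values are made up.
example_weights = [0.5, -0.25, 1.0]
example_params = [2, 4, -1]
example_score = linear_strength(example_weights, example_params)
```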

1.3 The purpose of Samuel's machine-learning process is
to select these parameters, and to arrive at the weighting
factors. This process is embodied in a learning program
separate from the playing program, which analyzes
transcribed games played by checker masters. More
specifically, what is analyzed are sets of all possible
positions immediately following a position occurring in the
course of a transcribed game. Among these positions is the
one resulting from the move actually made by the checker
master. The learning program assumes this to be the
strongest position of the set and designates it as such.

1.4 The remainder of the analysis is carried out on these
sets of positions as follows. The value of each parameter
is computed for all positions in the set, and the relation
between a parameter's value for the strongest position and
its values for other positions is noted. From this
information, collected from many sets of positions, a
correlation coefficient is computed which indicates the
linear relation between the value of a particular parameter
and the strength of corresponding board situations. For
example, a parameter such as piece advancement might often
have a high value for the strongest position in a set of
positions, and a low value for the weaker positions.
This analysis would assign a high positive correlation
coefficient to such a parameter. Of the many parameters
tested, only those with a high coefficient were included
among the 31 used by Samuel in his playing program. The
correlation coefficient for a parameter was used as its
weighting factor.

1.5 The modified learning process here described is
analogous to the selection of weighting factors in the
process described above. No judgement is made as to the
utility of possible parameters, and exactly the set selected
by Samuel is employed. The purpose of the learning process
is to aid in the construction of a function which assigns
strength values to board situations. Again, the process is
based on an analysis of games played by checker masters.

1.6 The essential difference between the modified
learning process and the original is that the strength
function produced by the former is not restricted to being
linear. The use of a linear strength function is equivalent
to assuming that board strength varies linearly with the
value of each parameter, and that the parameters are
themselves not interrelated. Such assumptions are not
entirely valid. Hence, it was felt that a more accurate
strength evaluation function might be produced by a learning
technique less restricted as to what sort of function it
could produce. This flexibility is made possible by the
use of tabulated functions, as will now be described.



2.1 The new scoring function is defined in terms of eight
auxiliary functions whose values are given in tables listing
the value of each function for all possible sets of argument
values. Such a function is practical in terms of space
requirements if the number of arguments is small, and if
each argument can take on only a small number of values.
The eight functions of this type which we shall use have
five arguments each, and the arguments take on only the
values 0, 1, or 2. For such a function, a table of only 243
entries is required. It should be noted that any function
whatever of five three-valued arguments can be defined by
such a table.

2.2 For this sort of tabulated function there exists a
simple correspondence between five-tuples of argument values
and locations in the table where a corresponding function
value is to be placed. This is best explained by the
following example. Consider, for such a function, the
five-tuple of argument values 1,0,1,0,2. Regard this
five-tuple as a base three integer, 10102. This number is
92 base ten, and F(1,0,1,0,2) is located in the 92nd
location of the function's table.
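
The correspondence between five-tuples and table locations can be sketched as follows. The function name table_index is hypothetical, and the sketch assumes locations are counted from zero, so that the five-tuple 0,0,0,0,0 falls in location 0 and 2,2,2,2,2 in location 242.

```python
from typing import Sequence


def table_index(args: Sequence[int]) -> int:
    """Read a tuple of values in {0, 1, 2} as a base-three integer."""
    index = 0
    for value in args:
        if value not in (0, 1, 2):
            raise ValueError("each argument must be 0, 1, or 2")
        # Shift the digits read so far one place left, then add the new digit.
        index = index * 3 + value
    return index


# Five three-valued arguments give 3**5 distinct five-tuples.
TABLE_SIZE = 3 ** 5

location = table_index((1, 0, 1, 0, 2))  # the example five-tuple, read as 10102 base three
```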

2.3 The form of the new scoring function S' is:

$$S'(X_1, X_2, \ldots, X_{31}) = \sum_{i=1}^{8} F_i(Y_{i1}, Y_{i2}, \ldots, Y_{i5})$$

Where:

a. Each of the eight F_i's is a tabulated function as
described above.

b. Each Y_ij is one of the X_k's. Hence the five-tuple of
argument values presented to a subfunction F_i is a
particular subset of the arguments presented to S'.

c. The argument values presented to the function S' are
the values of the 1st to 31st parameters selected by Samuel
for use by his playing program. However, they have been
reduced to having the value 0 if negative, 1 if zero, and 2
if positive. The reduction of argument values to the range
0, 1, or 2 is clearly necessitated by the nature of the
component F_i's, since it is subsets of values of these
parameters that are used as the arguments presented to these
functions. The possibility of constructing a successful
board-strength function which uses only this limited amount
of information about a parameter value was suggested by the
nature of the parameters themselves. The parameters tend to
be only qualitative in nature, so that little information
about what a parameter is supposed to measure is gained from
an exact numerical value.

2.4 Let us consider a simple example of the operation of
this function. Assume that for each i, Y_i1, Y_i2, ..., Y_i5 are
chosen to be X_1, X_2, ..., X_5 respectively. Note that this is a
rather trivial case. We wish to demonstrate how the value
of the function S' for the parameter values
-5, -2, 1, -3, 0, 0, ..., 0 is computed. These values are first
reduced to values 0, 1, or 2: 0, 0, 2, 0, 1, 1, ..., 1. For each
i the values presented to the component F_i's, i.e. the
values of Y_i1, Y_i2, ..., Y_i5, are respectively 0, 0, 2, 0, 1. Thus:

$$S' = F_1(0,0,2,0,1) + F_2(0,0,2,0,1) + \cdots + F_8(0,0,2,0,1)$$

Since 00201 as a base three integer is 19 base ten, the
value of F_i(0,0,2,0,1) is located in the 19th entry in the
table of values for the i-th function. Thus the value of
S' for the given arguments is the sum of the 19th entries in
each of the eight tables.
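
The whole evaluation for this trivial case can be sketched as below. The names reduce_value, table_index and score are hypothetical, the table contents are placeholder values chosen only so the sum is easy to check, and table locations are assumed to be counted from zero.

```python
from typing import List, Sequence


def reduce_value(x: float) -> int:
    """Map a raw parameter value to 0 (negative), 1 (zero), or 2 (positive)."""
    if x < 0:
        return 0
    if x == 0:
        return 1
    return 2


def table_index(args: Sequence[int]) -> int:
    """Read a tuple of values in {0, 1, 2} as a base-three integer."""
    index = 0
    for value in args:
        index = index * 3 + value
    return index


def score(raw_params: Sequence[float],
          arg_sets: Sequence[Sequence[int]],
          tables: Sequence[Sequence[float]]) -> float:
    """Sum, over the component functions, of the table entry selected
    by that component's five reduced parameter values."""
    reduced = [reduce_value(x) for x in raw_params]
    total = 0.0
    for positions, table in zip(arg_sets, tables):
        five_tuple = [reduced[p] for p in positions]
        total += table[table_index(five_tuple)]
    return total


# Trivial case from the text: every component reads the first five parameters.
raw = [-5, -2, 1, -3, 0, 0] + [0] * 25           # 31 raw parameter values
arg_sets = [[0, 1, 2, 3, 4]] * 8                  # zero-based parameter positions
tables: List[List[float]] = [[0.0] * 243 for _ in range(8)]
for table in tables:
    table[table_index([0, 0, 2, 0, 1])] = 1.0     # placeholder entry, not real data
example = score(raw, arg_sets, tables)            # one placeholder entry per table
```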



3.1 The object of the modified learning procedure is to
determine appropriate tabulated function values from an
analysis of transcribed games of checker masters. Again,
sets of alternative positions, among which is designated a
strongest, are recorded. For each position, the values of
the 31 parameters are computed, and reduced to the values
0, 1, or 2. From this set of values, eight five-tuples of
values are chosen, each five-tuple being the appropriate set
of arguments for a component function. These five-tuples
are regarded as integers whose value is a serial location in
a function-value table, according to the scheme described in
section 2.2. Hence to each position in the set of
positions correspond eight locations, one in each table.
At this stage of construction, each table location contains
two components, a right and a left half, both initially
zero. For each position in a set which is not a strongest
position, a one is accumulated in the right half of each of
the eight table entries to which that position corresponds.
For the strongest position in the set, a one is accumulated
in the left half of each of the corresponding table
locations. This process is repeated for thousands of sets
of positions.

3.2 At this point, the entry pairs are converted to the
form in which they are to be used. This is accomplished by
dividing the left half by the right half, and replacing the
half-entries by the quotient. Let us note what such a
quotient represents. Consider a particular entry in a
particular table. To the table under consideration
corresponds a set of five parameters, the values of which
are used as the arguments of the function whose values are
placed in this table. Call these parameters P1, P2, ..., P5.
To the entry in question corresponds a set of values for
these parameters. Assume, for example, that these values
are 2,0,1,0,1. The reader may verify by referring to
section 2.2 that we have in mind the 172nd table entry. The
value of the table entry under consideration represents
approximately the probability that a position with the
values 2,0,1,0,1 for the parameters P1, P2, ..., P5 is a strong
position. This may be seen by observing that ones were
accumulated in the table entry under consideration exactly
when a position with the values 2,0,1,0,1 for parameters
P1, P2, ..., P5 was encountered in the reconstruction of book
games. Whenever such a position was designated as the
strongest in its set, a one was accumulated in the numerator
of the fraction whose value was ultimately to replace the
two half-entries. Similarly, for a non-strongest position,
a one was accumulated in the denominator. Enough positions
were examined that the numerators and denominators of these
fractions were in general significant.
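
The accumulation and division steps of 3.1 and 3.2 can be sketched as follows. All names are hypothetical, the input is assumed to be already reduced to the values 0, 1, or 2, and the treatment of an entry whose right half is still zero (set here to 0.0) is an assumption, since the text does not say how such entries were handled.

```python
from typing import List, Sequence, Tuple

PositionSet = Tuple[int, List[List[int]]]  # (index of strongest, reduced parameter vectors)


def table_index(args: Sequence[int]) -> int:
    """Read a tuple of values in {0, 1, 2} as a base-three integer."""
    index = 0
    for value in args:
        index = index * 3 + value
    return index


def build_tables(position_sets: Sequence[PositionSet],
                 arg_sets: Sequence[Sequence[int]]) -> List[List[float]]:
    """Count strongest (left) and non-strongest (right) occurrences per
    table location, then replace each pair by the quotient left / right."""
    size = 3 ** 5
    left = [[0] * size for _ in arg_sets]
    right = [[0] * size for _ in arg_sets]
    for strongest, positions in position_sets:
        for k, reduced in enumerate(positions):
            for t, arg_positions in enumerate(arg_sets):
                loc = table_index([reduced[p] for p in arg_positions])
                if k == strongest:
                    left[t][loc] += 1
                else:
                    right[t][loc] += 1
    # Assumption: an entry never seen as non-strongest gets the value 0.0.
    return [[l / r if r else 0.0 for l, r in zip(ls, rs)]
            for ls, rs in zip(left, right)]


# Tiny made-up data: five parameters, one component function.
sets: List[PositionSet] = [
    (0, [[2, 0, 1, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]),
    (1, [[2, 0, 1, 0, 1], [2, 0, 1, 0, 1]]),
]
tables = build_tables(sets, [[0, 1, 2, 3, 4]])
```

With this made-up data the entry for 2,0,1,0,1 is counted twice as strongest and once as non-strongest, so its quotient is 2.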

3.3 It would seem that to make full use of the type of
function I have described, an optimal choice should be made
for the parameter sets to be used as the arguments of the
component functions. A rather elaborate technique was
devised for this purpose, and the argument sets used in the
function whose performance I shall describe were arrived at
by this technique. However, insufficient data exists to
make any evaluation of this technique, and I shall not
describe it here.



4.1 The performance of this function was extensively
tested on actual checker situations. Again, tabulated
games of checker masters were used. As before, sets of
alternative positions were recorded, with that of the
checker master designated as strongest in the set. The
function was applied to each position in a set, and the
extent to which the scoring function agreed with the checker
master's opinion was noted. The scoring function was
considered to have made an accurate evaluation for a set of
positions if the highest or next-highest score in the set
was assigned to the checker master's choice. An assignment
of the next-highest score to the designated strongest
position was considered to be accurate, since often there
are two best moves from a given position. Thus in many such
cases the position given the highest score is as strong as
the one designated as strongest. The function was
considered to have made a blunder for a set of positions if
the checker master's choice was given a score below the
median of the scores assigned to the positions in the set.
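
The two criteria can be sketched as a small classifier. The name judge is hypothetical, and two details are assumptions not settled by the text: tied scores are resolved by counting only strictly higher scores, and the accurate test is applied before the blunder test.

```python
from statistics import median
from typing import Sequence


def judge(scores: Sequence[float], master_index: int) -> str:
    """Classify one position set as 'accurate', 'blunder', or 'neither'."""
    master = scores[master_index]
    # Number of positions scored strictly higher than the master's choice.
    higher = sum(1 for s in scores if s > master)
    if higher <= 1:
        # The master's choice received the highest or next-highest score.
        return "accurate"
    if master < median(scores):
        return "blunder"
    return "neither"


# Made-up scores for a set of four positions.
verdict = judge([0.9, 0.5, 0.1, 0.3], master_index=2)
```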

4.2 The tables of function values were arrived at from an
analysis of 12,000 sets of positions. The scoring function
using these tables was tested on 1,000 positions which were
not among those analyzed. The percentage of blunders was
about 21; the percentage of accurate evaluations was about
50 (28% perfect and 22% next to perfect).

4.3 Samuel measured the accuracy of his polynomial S in
terms of a single index. This index has the value +100 if
the assignment of scores to the positions in a set is always
such that the highest score is assigned to the strongest
position. The index takes on a value around 0 if the
scoring function is performing no better than it would have
by making a random assignment of scores to the positions in
a set. For comparison purposes, this index was also
computed for the function herein described. The function S'
had a performance index of 3^ on the sample mentioned in
section 4.2. For the same sample Samuel's S had a
performance index of about 27.

4.4 The index mentioned in the last paragraph also
provides an indication of the relation between the
performance of the new scoring function and the number of
position sets used in constructing it. The function was
constructed from the first 2,000 position sets of the 12,000
described in 4.2. Then its performance index on the test
sample was computed. This procedure was repeated using the
first 4,000 positions, the first 6,000 positions, and
finally with the entire set of 12,000 positions. As might
be expected, the more position sets used to generate the
function, the better the performance, up to a certain
saturation point. This effect is shown by the lower line in
Fig. 1.

4.5 This index was also computed for the performance of
the function on a sample of position sets among those from
which the function had been constructed. This sample
consisted of the first 1,000 position sets of the 12,000
mentioned in 4.2. Again, the function was constructed from
the first 2,000 position sets, the first 4,000, the first
6,000 and all 12,000. The function's performance index was
computed in each case. The performance on this sample falls
from an initially very high value to somewhat above the
value ultimately attained for performance on the sample
mentioned in 4.2. This effect is illustrated by the upper
line in Fig. 1.



5.1 The new function S' has shown itself to be more
accurate than Samuel's linear polynomial S. An even more
accurate scoring function might be possible if the set of 31
parameters is augmented by the addition of nonlinear
parameters, or sets of interrelated parameters. A nonlinear
parameter is one which, for example, takes on the
value 2 or 0 for strong positions and the value 1 for weak
positions. An example of the behavior of a related pair of
parameters, A and B, is as follows. The values 2 and 2 of A
and B respectively, and the values 0 and 0 of A and B,
indicate a strong board situation. Other combinations
indicate a weak position. Notice that parameters of this
type could easily have very low linear correlations with
board strength. They would thus be rejected as useless by
Samuel's learning process. However, they would not
necessarily be useless when employed in connection with the
sort of nonlinear function produced by the modified
procedure.